08 May 2022

Introduction

  • Dataset: MiRNA expression in squamous cell carcinoma and adenocarcinoma of the esophagus and associations with survival.

    • Cancerous and noncancerous tissues.
    • Cancer stages.
    • 100 ADC and 70 SCC patients.
    • North American and Japanese cohorts.
    • Other variables (tobacco, alcohol).


  • Objective of the project: To reproduce the findings of the authors and elaborate on their visualizations.

Materials and methods

Flowchart of the data wrangling and analysis

Results

Survival probability for different groups
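
As an illustration of how a survival probability curve of this kind can be computed, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is not the code used in the project; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each patient
    events:    1 if death was observed, 0 if the patient was censored
    """
    # Number of observed deaths at each time point.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    # Number of patients leaving the risk set (death or censoring) at each time.
    leaving = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: five patients, two of them censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for time, prob in toy_curve:
    print(f"t={time}: S(t)={prob:.2f}")
```

Calling the function once per group (for example per tumor type or per stage) gives one curve per group, which is what a grouped Kaplan-Meier plot displays.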

Volcano plots
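
For reference, each point in a volcano plot combines an effect size with a significance value. The sketch below shows one common way to derive the two coordinates. It assumes linear-scale (not log-transformed) mean expression values and a p-value computed elsewhere; the function name `volcano_point` and the cutoffs `lfc_cutoff` and `alpha` are hypothetical and are not taken from the project or the original article.

```python
import math


def volcano_point(mean_tumor, mean_normal, p_value, lfc_cutoff=1.0, alpha=0.05):
    """Return (log2 fold change, -log10 p-value, passes_both_cutoffs).

    mean_tumor, mean_normal: positive, linear-scale mean expression values
    p_value:                 p-value from a separate statistical test
    lfc_cutoff, alpha:       illustrative thresholds (assumptions)
    """
    log2_fc = math.log2(mean_tumor / mean_normal)   # x-axis
    neg_log10_p = -math.log10(p_value)              # y-axis
    passes = abs(log2_fc) >= lfc_cutoff and p_value < alpha
    return log2_fc, neg_log10_p, passes


# Hypothetical miRNA: four-fold higher in tumor tissue, p = 0.001.
x, y, flagged = volcano_point(8.0, 2.0, 0.001)
print(f"log2FC={x:.2f}, -log10(p)={y:.2f}, flagged={flagged}")
```

If the expression values are already on a log2 scale, the fold change is the difference of the means rather than the log of their ratio.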

Differential expression: our findings

Differential expression: authors’ findings

Principal component analysis in cancerous samples

Discussion

  • Data quality: The absence of the individual North American cohorts in the data hindered reproduction of the article's findings.

  • Data wrangling: The complex data set required decisions about the selection of relevant probes, the handling of NA values, and other preprocessing steps.

  • Survival: We produced Kaplan-Meier curves showing that survival is associated with both tumor type and SEER stage.

  • Differential expression: We could not reproduce the authors’ results, but we found more differentially expressed miRNAs than they reported.

  • Principal Component Analysis: PCA shows that the data can be grouped by cohort with minor overlap. Overall, there is no clear clustering of the data based on the mean expression of miRNA.